Supervised pre-processing approaches in multiple class variables classification for fish recruitment forecasting
Abstract
A multi-species approach to fisheries management must take the interactions between species into account in order to improve recruitment forecasting. Recent advances in Bayesian networks allow models to be learned in which several interrelated variables are forecast simultaneously. These models are known as multi-dimensional Bayesian network classifiers (MDBNs). Pre-processing steps are critical for the subsequent learning of the model in these kinds of domains. Therefore, in the present study, a set of ‘state-of-the-art’ uni-dimensional pre-processing methods, within the categories of missing data imputation, feature discretization and feature subset selection, are adapted for use with MDBNs. A framework that combines the proposed multi-dimensional supervised pre-processing methods with an MDBN classifier is tested on synthetic datasets and on the real domain of fish recruitment forecasting. The rate at which three fish species (anchovy, sardine and hake) are all forecast correctly at the same time increases from 17.3% to 29.5% with the multi-dimensional approach, compared with mono-species models. The probability assessments also improve markedly, with the average error (estimated by means of the Brier score) falling from 0.35 to 0.27. Finally, these improvements are larger than those obtained when forecasting the species in pairs. © 2012 Elsevier Ltd. All rights reserved.
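The two evaluation measures quoted in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the recruitment levels ("low", "medium", "high") and the toy data are assumptions made for illustration, and the Brier score is written in its usual multi-class form (mean over instances of the summed squared differences between predicted probabilities and the 0/1 outcome), which may differ in detail from the estimate used in the paper.

```python
from typing import Dict, Sequence, Tuple


def joint_accuracy(y_true: Sequence[Tuple[str, ...]],
                   y_pred: Sequence[Tuple[str, ...]]) -> float:
    """Fraction of instances in which ALL class variables are predicted correctly."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if tuple(t) == tuple(p))
    return hits / len(y_true)


def brier_score(probs: Sequence[Dict[str, float]],
                labels: Sequence[str]) -> float:
    """Multi-class Brier score for one class variable (lower is better)."""
    total = 0.0
    for dist, label in zip(probs, labels):
        # Squared distance between the predicted distribution and the one-hot outcome.
        total += sum((p - (1.0 if k == label else 0.0)) ** 2
                     for k, p in dist.items())
    return total / len(labels)


# Hypothetical toy data: each tuple is (anchovy, sardine, hake) recruitment level.
y_true = [("low", "high", "medium"), ("high", "high", "low")]
y_pred = [("low", "high", "medium"), ("high", "low", "low")]
acc = joint_accuracy(y_true, y_pred)

# Hypothetical predicted distributions for a single species.
probs = [
    {"low": 0.7, "medium": 0.2, "high": 0.1},
    {"low": 0.5, "medium": 0.0, "high": 0.5},
]
bs = brier_score(probs, ["low", "high"])
```

Under this definition, an instance counts toward joint accuracy only if every species is forecast correctly, which is why the joint figures are much lower than typical per-species accuracies.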
Related resources
Data analysis advances in marine science for fisheries management: Supervised classification applications
The impact of how fisheries are managed is of great importance on biological, economic, social and political levels. However, there is still a high uncertainty about the relationships between climate, fish and management decisions. Many activities are performed in marine science in order to reduce this uncertainty. This dissertation provides methodological contributions to several of...
Composite Kernel Optimization in Semi-Supervised Metric
Machine-learning solutions to classification, clustering and matching problems critically depend on the adopted metric, which in the past was selected heuristically. In the last decade, it has been demonstrated that an appropriate metric can be learnt from data, resulting in superior performance as compared with traditional metrics. This has recently stimulated a considerable interest in the to...
Indoor Positioning and Pre-processing of RSS Measurements
Rapid expansion of new location-based services signifies the need for accurate localization techniques for indoor environments. Among the different techniques, RSS-based schemes, and in particular one of their variants based on Graph-based Semi-Supervised Learning (G-SSL), are widely used approaches. The superiority of this scheme is that it has low setup/training cost and at the same ti...
Hyperspectral Image Classification Based on the Fusion of the Features Generated by Sparse Representation Methods, Linear and Non-linear Transformations
The ability to record the high-resolution spectral signature of the Earth's surface is the most important feature of hyperspectral sensors. On the other hand, classification of hyperspectral imagery is known as one of the methods for extracting information from these remote sensing data sources. Despite the high potential of hyperspectral images from the information content point of view, there...
Detecting Concept Drift in Data Stream Using Semi-Supervised Classification
A data stream is a sequence of data generated from various information sources at high speed and in high volume. Classifying data streams faces the three challenges of unlimited length, online processing, and concept drift. In related research, to meet the challenge of unlimited stream length, the stream is commonly divided into fixed-size windows or gradual forgetting is used. Concept drift refer...
Journal:
- Environmental Modelling and Software
Volume 40, Issue
Pages -
Publication date: 2013